ABSTRACT

The purpose of this study was to examine whether technology-based
learning environments have the potential to support dual-language
learners’ (DLLs) vocabulary learning in their less dominant language.
Interrogating Dual-Coding Theory (Paivio, 1986), this study investigates
whether DLLs benefit from media content that is delivered both orally
and visually, and uses English language proficiency as an important
contextual factor that might impact vocabulary learning on screens.
Adopting a within-subjects design on 43 preschool-aged DLLs, and using
eye-tracking technology to monitor children’s attention, this study finds
that DLLs are able to identify more words that are taught on screen
when information is dual-coded, particularly if they have lower English
language proficiency. Implications for the field of computer-assisted
language learning are discussed.

KEYWORDS

Educational media; preschool; vocabulary; dual-language learners;
dual-coding theory; eye-tracking


Children today are exposed to screens at a very young age, watching 
educational media programs that promise to foster early literacy skills 
well before they set foot in school (Lemish, 2015; Neuman, Wong, Flynn, 
& Kaefer, 2019). These educational programs that are marketed to pro- 
vide a head start may be particularly helpful for young dual-language 
learners (DLLs) who come from households that speak a language other 
than the dominant language of school and society at home. Facilitating 
the bilingual development of young DLLs, media has the potential to 
provide children with ample exposure to new vocabulary words, using 
bells and whistles that draw their attention to key learning experiences 
(Kirkorian & Anderson, 2008; Verhallen, Bus, & de Jong, 2006). Scholars 
continue to explore the mechanisms that facilitate learning from screens, 
now extending the literature to examine how media can help children 
learn a new or additional language. 

A prominent theory underpinning learning from screen media in the
field of computer-assisted language learning (CALL) is dual-coding the- 
ory (Paivio, 1986), which proposes that when information is transmitted 
through verbal (speech) and nonverbal (visual) channels, it is represented 
more fully, leading to stronger comprehension and greater recall (Mayer, 
1997). While this theory is adopted in numerous studies with monolin- 
gual preschool populations, dual-coding theory has not been explicitly 
investigated among DLLs in media contexts, particularly for children in 
the preschool years. 

Consequently, the purpose of the current study is to investigate the 
potential of educational media to provide preschool-aged DLLs with rich 
scaffolds that facilitate word learning in a second language. Specifically, 
this study aims to examine the importance of dual-coding theory for 
DLL viewers learning words in a second language. It also investigates 
how two key language factors for DLLs, child vocabulary in a second 
language (L2) and parental L2 language ability, differentially impact 
dual-coding mechanisms when viewing educational media. 


Educational media and dual-language learners 


Young children are in front of screens viewing educational media for 
long periods of time, leading many around the world to study educa- 
tional media as a means of improving early literacy (Lemish, 2015). In 
the United States, DLLs are spending an average of two hours or more 
on screen per day (Rideout, 2014), watching programs that are marketed 
for bilingual learning (Wong & Neuman, 2019). Scholars document the 
benefits of educational media for preschoolers, demonstrating gains in 
early literacy, vocabulary, and problem solving (Anderson & Kirkorian, 
2015). Still, others warn that toddlers who are three years old and under
might not have the capacity to learn from screened platforms because of 
a video deficit (Anderson & Pempek, 2005), described as the discrepancy 
between learning from a live person and learning from a screened plat- 
form. In response, the American Academy of Pediatrics (AAP) has issued 
policy statements that recommend limiting television exposure to young 
children. Despite this recommendation, national surveys of media con- 
sumption in the United States report that 73% of 2-4 year-olds watch 
television every day for an average of 1.9 hours per day (Common Sense
Media, 2013). As media plays an increasingly central role in the lives of 
young people, scholars continue to examine how media can be strategic- 
ally used to support learning, particularly among vulnerable populations 
like DLLs. 

DLLs are defined in the current study as children from birth to
5 years old who are learning two or more languages at the same time,
which includes learning a second language while still developing
their first or home language (Takanishi & Le Menestrel, 2017). In the
United States, DLLs are currently the fastest growing population in 
schools, with the largest representation of DLLs speaking Spanish as a 
first language and English as a second language (Capps, 2015; 
Connor, Cohn, Gonzalez-Barrerra, & Oates, 2013). DLLs grow up in 
households with varying amounts of exposure to a first and second 
language. In the early childhood years before formal schooling occurs, 
DLLs are primarily immersed in the language(s) spoken by parents, 
which, in the U.S. context, is often a language other than English 
(Hammer et al., 2014). With less English exposure than their mono- 
lingual counterparts who grow up speaking English or the dominant 
language of schools, preschool-aged DLLs are often more proficient in 
their home language (L1) than they are in English (L2). As a result, 
DLLs often perform below their monolingual peers in English vocabu- 
lary development (Hammer et al., 2014) because schools demand that 
content is learned in English. Although these differences are estab- 
lished in the first years of schooling, longitudinal studies suggest that 
English vocabulary is critical in the early years, which suggests DLLs 
may encounter challenges in their educational trajectory (Halle, Hair, 
Wandner, McNamara, & Chien, 2012; Han, 2012). 

A meta-analysis examining how young children can best learn
new words found that using educational media as an
instructional tool was associated with significant gains in vocabulary
knowledge (Marulis & Neuman, 2013). Although this meta-analysis
did not specifically address word learning in a second language, these 
positive developments in vocabulary learning are also reported in DLL 
classrooms when teachers provide multimedia-rich instruction (i.e. video 
clips about vocabulary words or animated e-books) to young learners 
(Silverman & Hines, 2009; Verhallen et al., 2006). Still, the specific attrib- 
utes of effective media instruction remain largely unknown. In response, 
scholars are now examining the pedagogical supports on screen that 
might facilitate vocabulary learning in young children (Danielson, Wong, 
& Neuman, 2019; Larson & Rahn, 2015; Linebarger & Piotrowski, 2010; 
Neuman et al., 2019; Teng, 2019). A more recent study that examined 
200 online streamed programs found that the use of ostensive (defin- 
itional) and attention-directing cues were the most salient screen-based 
pedagogical supports used to promote vocabulary learning in young 
children’s programming (Neuman et al., 2019). Similarly, a content ana- 
lysis of bilingual programming demonstrated that visual supports and 
repetitions were common screen-based pedagogical supports for
vocabulary learning in a second language (Wong & Neuman, 2019).


Repetition in media to support vocabulary learning 


Relatedly, the use of repetition and visual supports are two pedagogical 
scaffolds that are well-aligned with the research on learning vocabulary 
in a second language. Looking first at the use of repetition, scholars 
agree that DLLs benefit from consistent and repeated exposure to
languages (Cha & Goldenberg, 2015; Hammer et al., 2014; Hammer,
Lawrence, & Miccio, 2008; Quiroz, Snow, & Zhao, 2010; Thordardottir, 
2011). Children’s bilingual vocabulary development is closely related to 
the breadth of vocabulary words that they are exposed to in each lan- 
guage, as well as the frequency of encountering these vocabularies in 
each language in the home, school, or community context. While the 
rate of language development is related to the amount of speech mono- 
lingual children hear in that language (Hoff, 2006), bilingual infants 
(Hoff et al., 2012) and preschoolers (Hammer et al., 2008) develop their 
L1 and L2 vocabulary relative to the amount of input in each language. 

Like previous studies, Uchikoshi (2006) also examined children’s 
exposure to second language in the home and school environment, but 
uniquely investigated how viewing educational media in a second lan- 
guage affected vocabulary growth. Applying growth modeling analyses to 
150 Latino-English DLLs, Uchikoshi found that children who viewed two 
educational media programs - Arthur and Between the Lions - at home 
had steeper growth trajectories than those who did not, suggesting that 
broad exposure to a language in media is beneficial for L2 vocabu- 
lary learning. 


Visual supports in media to support vocabulary learning 


Visual supports, which include visual representations of vocabulary 
words, illustrations, demonstrations, or multimedia, can serve as essential 
scaffolds for dual language vocabulary learning. The need for visuals is 
apparent in a number of successful interventions in early childhood set- 
tings, suggesting that visuals provide DLLs with the supports needed to 
make core content comprehensible (Leacox & Jackson, 2014; Silverman 
& Hines, 2009; Takanishi & Le Menestrel, 2017). Silverman and Hines 
(2009) compared DLL and non-DLL populations in preschool to second 
grade to understand how traditional and multimedia-enhanced vocabu- 
lary instruction differentially affected learners. Multimedia-enhanced 
instruction involved short, 5-minute video clips that were topically 
related to the storybooks and provided rich visual representations of
target words. Findings demonstrated that these visual representations
scaffolded vocabulary instruction for DLLs, providing them with significant
gains in vocabulary knowledge that were unique to the DLL population. 

Similarly, using a within-subjects design on 24 Spanish-speaking pre- 
schoolers and kindergarteners, Leacox and Jackson (2014) used technol- 
ogy-enhanced e-books that pictured target words on one side of the 
screen and provided a short definition of the word in Spanish when 
clicked. These pictures appeared three times for each target word 
throughout the e-book, and yielded more word learning gains than in 
the control, adult storybook reading condition. Studies like these argue 
that having access to the meaning of new words through visual scaffolds 
helps reinforce vocabulary concepts, deepen vocabulary knowledge and 
support oral language development in young DLLs (Gersten & Baker, 
2000; Leacox & Jackson, 2014; Silverman & Hines, 2009; Takanishi & Le 
Menestrel, 2017; Teng, 2019). 

Many scholars agree that multimedia may be an appropriate platform 
to provide L2 vocabulary instruction to DLLs, using clear verbal and vis- 
ual scaffolds to support vocabulary learning and deliver ample and 
repeated exposure to new words (Neuman et al., 2019; Silverman & 
Hines, 2009; Uchikoshi, 2006; Verhallen et al., 2006). The purpose of this 
paper is to extend this understanding of how DLLs might learn new 
words from educational media by closely examining dual-coding theory 
(Paivio, 1986). More specifically, it seeks to interrogate and establish 
whether dual-coding theory applies to DLLs who are acquiring vocabu- 
lary in a second language. 


Theoretical framework 


Dual-coding theory (Paivio, 1986, 2008) is a theory of cognition often 
used in CALL-related research, which asserts that the formation of men- 
tal images facilitates learning. When information is processed in the 
brain, activity occurs in two distinct subsystems - a verbal system speci- 
alized in processing language, and a nonverbal system specialized in non- 
linguistic imagery (see Figure 1). When information is simultaneously 
transmitted through verbal and nonverbal channels, dual-coding theory 
proposes that nonverbal information can help young children compre- 
hend unfamiliar language like vocabulary and complex grammar. 
Inversely, verbal information may help children process information that 
is presented in unfamiliar images. In other words, information is repre- 
sented more fully in memory when it is coded through two channels 
instead of one. Early studies of dual-coding theory demonstrate the
additive effects of two sources of input on memory recall when target
items are simultaneously presented with pictures and their names
(Paivio, 1986). Relatively recent scholarship continues to support the
theory by using functional brain evidence to demonstrate how mental
representations are modality specific and multimodal (Paivio, 2010).

Figure 1. Dual-coding theory (Paivio, 1986).

Dual-coding theory may be particularly applicable for DLL populations 
who are learning vocabulary words in a second language. When DLLs 
process information through two channels rather than one, they might 
benefit from an additional or compensatory scaffold that supports L2 
vocabulary learning. Aligned with the extant literature on language learn- 
ing, DLLs benefit from clear and explicit definitions of words (Carlo 
et al., 2004) and visual images that scaffold their understanding of
vocabulary words in a second language (Gersten & Baker, 2000). 
Although the two systems operate independently, dual-coding proposes 
that the interconnections between the verbal and nonverbal systems 
trigger activity in one another to help build a coherent mental
representation of a specific stimulus or learning experience. Providing
linguistically diverse children with both verbal and nonverbal input may,
therefore, lead to stronger mental representations of information,
which can influence comprehension and provide greater information 
recall (Mayer, 1997). 

Drawing from this theory, educational screen media has the potential 
to support DLLs’ vocabulary acquisition by offering multimodal, robust 
representations of information on the same topic. This means that 
watching educational media may facilitate learning by developing a rela- 
tively multidimensional and extensive understanding of new words and 
their meanings. Prior studies with bilingual learners have used dual-cod- 
ing theory as a lens for understanding how children might learn from 
media contexts (Silverman & Hines, 2009), or how coding in two differ- 
ent languages (verbal) and images (non-verbal) might enhance memory 
recall (Paivio & Lambert, 1981). More recently, scholars consider how 
dually-coded or multimodal annotations that provide pictorial and verbal 
clarifications to viewers may also result in increased attention to the 
screen, which is a predictor for language learning (Boers, Warren, 
Grimshaw, & Siyanova-Chanturia, 2017). Yet, no studies have specifically 
tested how dual-coding theory applies to DLLs when learning content in 
media. Moreover, affordances of media demonstrate the potential to ori- 
ent attention (Salomon, 1981), reduce cognitive demands (Sharp et al., 
1995) and motivate knowledge-seeking (Kamil, Intrator, & Kim, 2000). 
Together, this suggests that educational screen media may be a powerful 
mechanism for cultivating vocabulary development for DLLs in the early 
childhood years. Still, research on learning from media primarily exists 
in young children learning vocabulary in their first language; no research 
to date examines whether dual-coding theory might be applicable to the 
early lexical development of young DLLs as well. 


The present study 


The aim of this study, therefore, was to investigate whether dual-coding 
theory is a key mechanism underlying learning from screens among 
preschool-aged DLLs. We hypothesize that because educational media 
are able to provide dynamic visual and auditory sources of input to 
DLLs, they serve as compensatory scaffolds for DLLs who may 
have limited experience with the language at home. To investigate this, 
children viewed six media clips in two different conditions: dual-coding 
and non-dual-coding video clips. In the dual-coding condition, vocabu- 
lary words were presented to children with visual and auditory congru- 
ence, whereby words were introduced on screen in tandem with an 
image or visual representation of the word. The non-dual-coding condi- 
tion, on the other hand, provided new vocabulary words with visual 
and auditory incongruence, where words were introduced on screen at
a different time point from an image of the word. In other words, while
both conditions included visual and auditory representations of the
vocabulary words on screen, they occurred simultaneously in the dual-
coding condition and with temporal distance in the non-dual-
coding condition.

In addition, we were interested in understanding how specific language 
factors might influence vocabulary learning on screen. More specifically, 
we examined how two potential language factors for DLLs, child baseline 
vocabulary in a second language (L2) and parental L2 language ability, 
might differentially impact dual-coding mechanisms when viewing edu- 
cational media. Finally, to more precisely understand how visual sup- 
ports on screen affected children’s attention and vocabulary learning, we 
used eye-tracking technology to examine attention to relevant vocabulary 
on screen. The specific research questions guiding the current study were 
as follows: 


1. To what extent does dual-coding facilitate L2 vocabulary learning 
among DLLs? 

2. What is the influence of child and parental language factors, as well 
as attention to screen, on L2 vocabulary learning in educa- 
tional media? 


Method 
Participants 


To be eligible for this study, children had to be between four and
five years old and from households where a language other than English
was spoken. With the help of education directors and teachers, 44 DLL
participants were invited to participate in the study. IRB approval and
consent from the children and parents were obtained. A total
of 43 (97.7%) provided consent. Table 1 describes the final sample,
which consisted predominantly of children from native Spanish-speaking
households (N = 35) with an average age of 54.7 months (SD =
6.63 months). In the sample, 46.5% were female, 81.4% were Hispanic,
9.3% were African-American, and 9.3% were Haitian. All children were 
enrolled in two Head Start centers in a large urban city and qualified for 
free and reduced lunch. The language of instruction in the Head Start 
centers was English, the children’s L2. A home language environment 
questionnaire (LEQ) was provided to parents and teachers to capture 
children’s exposure to L1 and L2 languages in the home environment. 
This questionnaire was adapted from the Alberta Language Environment 
Questionnaire (Paradis, 2011) and a bilingual questionnaire developed by
Luk and Bialystok (2013). The LEQ also served as a screening tool. 
Children who had home language environment scores that indicated an 
emergent level of receptive English (L2) language skills were eliminated 
from the study as they were unlikely to comprehend video clips that pre- 
sented vocabulary words in English. Table 1 provides descriptive statis- 
tics of participants. 


Research design 


We used a within-subjects design to examine the effects of dual-coding 
on vocabulary learning in DLLs. In this type of design, each participant 
received both dual-coding (visual-auditory congruence) and non-dual- 
coding (visual-auditory incongruence) conditions. Children watched a 
total of six video clips from Sesame Street. Three video clips taught new 
words with visual-auditory congruence where a visual representation of 
a vocabulary word was paired simultaneously with an auditory label. 
Likewise, three video clips taught vocabulary with visual-auditory incon- 
gruence, whereby visual and auditory labels occurred at different time 
points. The condition order and clips used in each condition were coun- 
terbalanced between participants. 
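
The counterbalancing described above can be illustrated with a minimal Python sketch. The exact scheme used in the study is not reported, so the cyclic rotation of clips, the two condition orders, and the function names `build_schedules` and `assign_schedules` are assumptions for illustration only; the target words come from Table 2.

```python
import random

# Target words per condition, taken from Table 2.
DUAL = ["grater", "hurricane", "athlete"]
NON_DUAL = ["shelter", "comfort", "dusk"]


def rotations(items):
    """Return each cyclic rotation of a list (a simple Latin square)."""
    return [items[i:] + items[:i] for i in range(len(items))]


def build_schedules():
    """Cross two condition orders with three clip rotations (hypothetical scheme)."""
    schedules = []
    for dual, non_dual in zip(rotations(DUAL), rotations(NON_DUAL)):
        schedules.append(dual + non_dual)  # dual-coding block first
        schedules.append(non_dual + dual)  # non-dual-coding block first
    return schedules


def assign_schedules(participant_ids, schedules, seed=0):
    """Shuffle participants, then cycle through schedules so each is used about equally."""
    ids = list(participant_ids)
    random.Random(seed).shuffle(ids)
    return {pid: schedules[i % len(schedules)] for i, pid in enumerate(ids)}
```

Under these assumptions, 43 children spread over six schedules means each schedule is used seven or eight times, and the two condition orders differ by at most one child.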

A within-subjects design was deemed appropriate for this study 
because it controls for between-subjects variability. This is particularly 
important because children may respond to screens differently and have 
visual attention patterns that vary from one child to the next. This 
design reduces error and increases power to detect potential differences 
between conditions. Second, threats to a carry-over effect are minimal
because six different video clips were examined. Lastly, because
participants essentially serve as their own controls, a within-subjects
design accounts for significant threats to internal validity.


Table 1. Descriptive statistics of sample (N = 43).

Sample statistics
Gender                                        46.5% Female
Age (months)                                  54.7 (6.63)
Race
    African-American                          9.3%
    Haitian                                   9.3%
    Hispanic                                  81.4%
Primary language of home and child’s L1
    Fulani                                    9.3%
    Haitian Creole                            9.3%
    Spanish                                   81.4%
Qualify for free and reduced lunch            100%
PPVT standard score                           81.04 (11.82)
    High median split                         90.52 (8.31)
    Low median split                          71.14 (4.42)
LEQ                                           2.5 (1.42)
    High median split                         3.77 (.43)
    Low median split                          1.23 (.75)


Video stimuli 


Programs were selected from a content analysis of educational media 
designed for young preschoolers (Neuman et al., 2019). This content
analysis identified screen-based pedagogical supports for vocabulary 
learning on screen, categorizing these supports as ostensive (definitional) 
cues and attention-directing cues. The video clips for this current study 
used one-minute segments of the program Sesame Street, which taught 
vocabulary words using pedagogical supports common in educational 
media marketed for DLLs (Wong & Neuman, 2019). Using vocabulary 
clips from the media marketplace enhanced the ecological validity of the 
current study; at the same time, it limited video clip and vocabulary 
word options, requiring a thorough examination of available media to 
select comparable words across conditions. To enhance comparability 
across conditions, vocabulary clips included a straightforward definition 
of the vocabulary word with accompanying visual supports (see Table 2). 

Videos were organized into two conditions: dual-coding (visual- 
auditory congruence) and non-dual-coding (visual-auditory incongru- 
ence). A total of six video clips, three from each condition, were 
obtained from the same program (Sesame Street) to avoid a program 
effect as children may pay more attention to a program that they pre- 
fer. For each condition, we selected vocabulary words that provided 
explicit verbal and visual input simultaneously and also with temporal 
distance according to the dual-coding condition (see Table 2). For 
example, in one clip, an athlete appeared on screen as a character
said, ‘That’s an athlete, Elmo!’ (visual-auditory congruence); in
another clip, a shelter was described by a
character, followed by a visual support of a shelter at a later time
point (visual-auditory incongruence). All vocabulary words included
visual representations on screen, including ‘comfort’ where a woman 
comforted a group of chickens by putting her arms around them, and 
‘dusk’ where the camera panned to a tranquil, empty street scene with 
the afterglow of a sunset that quickly darkened. 

To ensure consistency across conditions, video clips were manipulated 
to be comparable in length, focusing on teaching one vocabulary word. 
These vocabulary words had a similar level of difficulty according to the 
CHILDES database (MacWhinney, 2014). The CHILDES database con- 
sists of more than 5000 transcriptions of adult-child spoken interactions 
in home and laboratory settings. With approximately 3,500,000 words in
the database, words selected in this study occurred fewer than five times in
the utterances of 48-month-old children, indicating a word is challenging
and unlikely to be known by preschool-aged DLLs. Although it is challenging
to select words that are perfectly comparable to one another, using a
within-subjects design and counterbalancing children’s exposure to each
condition helped account for differences in word difficulty. Videos were
also manipulated so that words were repeated the same number of times
in all conditions. Moreover, words were piloted before the beginning of
the study and a screening measure was used to ensure children included
in the study did not know the words taught in the media clips.


Table 2. Description of vocabulary clips.

Condition        Program episode          Duration  Vocabulary  Definition in media             CHILDES word
                                                    word                                        frequency
Dual-coding      C is for Cooking         0:20      Grater      ‘A grater is something          4
                                                                that you stir with.’
                 Friends to the Rescue    0:28      Hurricane   ‘A hurricane is a very,         0
                                                                very big storm with lots
                                                                and lots of wind and rain.’
                 Be a Good Sport          0:25      Athlete     ‘An athlete is someone who      3
                                                                runs and jumps and throws.’
Non dual-coding  Wild Words and Outdoor   0:20      Shelter     ‘A shelter is a place where     0
                 Adventures                                     I can sleep, where I can
                                                                stay warm and dry and
                                                                protected from the
                                                                elements.’
                 Being Brave              0:24      Comfort     ‘Comfort is when you sit        2
                                                                close with your arms around
                                                                someone, help them feel
                                                                better.’
                 Firefly Fun and Buggie   0:20      Dusk        ‘Dusk is the time of day        0
                 Buddies                                        when it’s getting darker
                                                                outside and it’s almost
                                                                night time.’


Measures 


Screening measure 

Prior to the study, children were administered a brief 10-item screening 
measure. This included a picture of each of the six words where the
assessor asked children, ‘What is this?’ Children who accurately identified one
or more of the words were screened out and not included in the study.


Language environment questionnaire 
To better understand the language environment that children are 
immersed in at home, we assessed parental English language ability using 
an adapted version of the Language Environment Questionnaire (LEQ;
Paradis, 2011). The LEQ included eight items. These questions asked 
about how much English children’s parents spoke using a five-point scale 
(5-very fluent/very comfortable speaking about child to 0-not fluent/no 
understanding). A composite score was calculated for each LEQ. 


Peabody picture vocabulary test-IV (Dunn & Dunn, 2007) 

The Peabody picture vocabulary test (PPVT) served as a baseline for 
general English receptive vocabulary knowledge. It is an individually 
administered, norm-referenced test designed to be a valid and reliable 
measure of receptive language skills. Reliability of the standardized 
assessment ranges from .91 to .94. For this study, raw scores were 
converted to age-related standard scores. The PPVT has also been used 
as a predictor of academic skills for preschool-aged children whose 
primary languages are Spanish or English (Burchinal, Field, Lopez, 
Howes, & Pianta, 2012; Howes et al., 2008; Lugo-Neris, Jackson, & 
Goldstein, 2010). 


Word identification 

After viewing the six videos, children were individually administered a 
receptive 12-item word identification posttest: six words in-context and 
six words in-isolation. Similar in format to the PPVT-IV, children were 
asked to point to the correct word among two other foils. Distractors 
included foils that were thematically related to the key word (e.g.
keyword grater, distractors spatula and chef). The two contexts for vocabulary
knowledge are designed to reflect vocabulary words that are learned on a 
continuum from simple vocabulary knowledge to greater vocabulary 
word understanding (Nagy, Anderson, & Herman, 1987; Nagy & 
Townsend, 2012). The words in-context measure captured vocabulary in 
their original video context (e.g. a screenshot of the Sesame Street mon- 
ster holding the grater). Foils were screenshots of other thematically- 
related words in the same clip. The words in-isolation measure assessed 
images of the vocabulary words not presented in the context of the video 
(i.e. isolated clipart images of the target words without a background),
along with two thematically-related distractors. Children received a score 
for correct vocabulary words, which were transformed into accuracy pro- 
portions for the analysis. 
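
As an illustration of how item-level responses could be turned into accuracy proportions per condition, the following Python sketch may help. It assumes each child has one correct/incorrect response per target word in each of the two contexts; the data layout and the name `accuracy_proportions` are hypothetical and not taken from the study.

```python
# Mapping of target words to conditions, as listed in Table 2.
CONDITION = {
    "grater": "dual", "hurricane": "dual", "athlete": "dual",
    "shelter": "non_dual", "comfort": "non_dual", "dusk": "non_dual",
}


def accuracy_proportions(responses):
    """Return the proportion of correct items per condition for one child.

    `responses` maps (word, context) pairs to True/False, where context is
    "in_context" or "in_isolation" (an assumed layout for this sketch).
    """
    totals = {"dual": 0, "non_dual": 0}
    correct = {"dual": 0, "non_dual": 0}
    for (word, _context), is_correct in responses.items():
        condition = CONDITION[word]
        totals[condition] += 1
        correct[condition] += int(is_correct)
    # Skip a condition with no administered items to avoid division by zero.
    return {c: correct[c] / totals[c] for c in totals if totals[c]}
```

With six items per condition (three words in two contexts), a child who answers four dual-coding items correctly would receive a dual-coding proportion of 4/6.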


Eye-tracker fixation duration 


To allow for a more precise analysis of how young children’s attention is 
influenced by visual-auditory congruency, we used eye-tracking
technology. Eye movements were measured with a Tobii Technology
T120 eye-tracker integrated into a 17 in. thin film transistor monitor. 
This is a remote eye-tracking system that had no contact with the child 
and has been used with young children (Neuman, Kaefer, Pinkham, & 
Strouse, 2014). After watching video clips on the eye-tracker, dynamic 
Areas of Interest (AOIs) were drawn around the visual representation of 
target words for the entire span of time the item was on screen. When, 
for example, an image of the vocabulary word, ‘grater,’ appeared on 
screen, an AOI was drawn around it to capture children’s total fixation 
duration on the grater during the video clip. These were drawn for both 
the dual-coding and non-dual-coding conditions. 
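
To make the idea of a dynamic AOI concrete, the sketch below sums the time that gaze samples fall inside an AOI whose position changes over the clip. It is a simplified illustration, not the eye-tracker software's algorithm: it counts raw gaze samples rather than filtered fixations, assumes a fixed 120 Hz sampling interval, and uses hypothetical names (`AoiFrame`, `dwell_time_ms`).

```python
from dataclasses import dataclass


@dataclass
class AoiFrame:
    """One time-limited rectangle of a dynamic AOI, in screen pixels."""
    start_ms: float
    end_ms: float
    left: float
    top: float
    right: float
    bottom: float


def dwell_time_ms(samples, aoi_frames, sample_interval_ms=1000 / 120):
    """Sum the time of gaze samples that land inside the active AOI rectangle.

    `samples` is an iterable of (time_ms, x, y) tuples. The default interval
    assumes 120 Hz sampling, which is an assumption of this sketch.
    """
    total = 0.0
    for t_ms, x, y in samples:
        for frame in aoi_frames:
            active = frame.start_ms <= t_ms < frame.end_ms
            inside = frame.left <= x <= frame.right and frame.top <= y <= frame.bottom
            if active and inside:
                total += sample_interval_ms
                break  # count each sample at most once
    return total
```

Because the rectangle is looked up by time stamp, the same function covers an object that moves across the screen and an object that appears only for part of the clip.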


Procedure 


Children were individually administered the screening measure and 
PPVT prior to the start of the study. Following these assessments, 
trained graduate student assessors escorted children to the library one- 
by-one to watch video clips on the eye-tracking apparatus. Assessors ran- 
domly assigned children to a counterbalancing condition. Children sat 
approximately 60 cm from the eye-tracking monitor. To calibrate gaze,
an attention-grabber was shown at five points on the screen. Children
then viewed six one-minute video clips, followed immediately by word
identification assessments. The session took approximately
20-25 min to complete per child.


Analysis 


We used a repeated measures analysis of covariance (ANCOVA) with 
the dual-coding condition as the within-subjects factor, a median-split 
on LEQ and PPVT scores as between-subjects factors, and age as a cova- 
riate. Accuracy proportions on the posttests served as outcome variables. 
To examine children’s attention, we calculated the amount of time spent 
fixating on the target item on screen. Considering each clip was not 
identical in length, we created percentages for fixation duration to exam- 
ine differences in attention across condition. We also ran a regression 
model with fixation duration (i.e. visual attention), PPVT, LEQ, and age 
in months as predictors, and posttest vocabulary scores in dual-coding 
and non-dual-coding assessments as dependent variables. 
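
Two of the data preparation steps described above, the fixation percentages and the median split, can be sketched in Python as follows. The denominator for the percentage (clip duration) and the handling of scores tied at the median (assigned to the low group) are assumptions of this sketch, as the study does not report them; the function names are hypothetical.

```python
from statistics import median


def fixation_percentage(fixation_ms, clip_duration_ms):
    """Express fixation time on the target as a percentage of clip length (assumed denominator)."""
    if clip_duration_ms <= 0:
        raise ValueError("clip_duration_ms must be positive")
    return 100.0 * fixation_ms / clip_duration_ms


def median_split(scores):
    """Label each participant 'high' or 'low' relative to the sample median.

    `scores` maps participant id to a score (e.g. PPVT or LEQ). Scores equal
    to the median are labeled 'low', which is an assumption of this sketch.
    """
    cut = median(scores.values())
    return {pid: ("high" if score > cut else "low") for pid, score in scores.items()}
```

Normalizing by clip length keeps clips of 20 and 28 seconds comparable, since a child could otherwise accumulate more fixation time simply because a clip ran longer.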


Results 


In this study, the aim was to answer two primary questions: (1) To what 
extent does dual-coding facilitate L2 vocabulary learning among DLLs? 
and (2) What is the influence of child and parental language factors, as 
well as attention to screen, on L2 vocabulary learning for DLLs watching 
educational media? Results for each question are reported in the sections 
that follow. 


Dual-coding for dual-language learners 


To answer the first research question, we examined vocabulary outcomes 
of DLLs in two video conditions: dual-coding (visual-auditory con- 
gruence) and non-dual-coding (visual-auditory incongruence). Using 
proportions of correct answers in posttest assessments, results indi- 
cated there was a main effect for dual-coding. First, a means table 
(see Table 3) indicates that, on average, children in the dual-coding 
condition selected the correct vocabulary words in posttest assess- 
ments 67% of the time, while those in the non-dual-coding condition 
selected correct words 52% of the time. Second, running the repeated 
measures ANCOVA with dual-coding as the within-subjects variable, 
visual-auditory congruence better supported vocabulary learning 
than visual-auditory incongruence, F (1, 38) = 6.07, p = .018 (see 
Table 4). 

In this manner, when dual-coding supports existed simultaneously on 
screen, DLLs were more likely to learn the vocabulary word than if the 
visual and auditory supports occurred at different time points. For 
example, when a character in one clip describes the word ‘hurricane’ and 
provides a visual image of a hurricane as it is introduced, children dem- 
onstrated greater vocabulary gains. On the contrary, when a character in 
another clip talks about the word ‘dusk’ but delays the visual depiction 
of dusk until after the word is described, DLLs were less likely to learn 
the word. It appears that two modes of sensory input provided the scaf- 
folds needed to sustain vocabulary learning after a single viewing of edu- 
cational media. When vocabulary presentations did not include 
simultaneous auditory and visual input, DLLs were less likely to learn 
the vocabulary word. 
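
Because each child contributes a score in both conditions, the comparison above rests on within-child differences in the proportion of posttest items answered correctly. A minimal sketch of that computation follows; the response lists are toy data, and the number of items per condition is an assumption rather than a detail reported in the study.

```python
from statistics import mean


def proportion_correct(responses):
    """responses: list of booleans, one per posttest item."""
    return sum(responses) / len(responses)


def mean_paired_difference(children):
    """Mean within-child difference (dual-coding minus non-dual-coding)."""
    diffs = [
        proportion_correct(c["dual"]) - proportion_correct(c["non_dual"])
        for c in children
    ]
    return mean(diffs)


# Toy data (not from the study): two children, three items per condition.
toy_children = [
    {"dual": [True, True, False], "non_dual": [True, False, False]},
    {"dual": [True, True, True], "non_dual": [True, True, False]},
]
toy_difference = mean_paired_difference(toy_children)
```

A positive mean difference indicates higher accuracy for words presented with congruent visual and auditory input; the repeated measures ANCOVA additionally adjusts for age and the between-subjects factors.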


Parental and child language factors 


Investigating further the influence of dual-coding on DLLs’ vocabulary 
learning, we examined whether there might be interactions between the 
dual-coding condition and parental or children’s L2 abilities. We used a 
repeated measures ANCOVA with dual-coding as the within-subjects 
variable, a median-split on parental L2 language proficiency (LEQ) as a 
between-subjects variable, and a median-split on DLLs’ L2 language 
proficiency (PPVT) as another between-subjects variable (see Table 4). 
Child’s age was used as a covariate. Findings revealed a significant 
interaction between dual-coding condition and parental L2 ability, 
F (1, 38) = 6.69, p = .014. DLLs with low parental language proficiency 
or exposure to L2 in the home benefited more from visual-auditory 
congruence than DLLs with high parental language proficiency. In other 
words, children who were less immersed in an L2 environment at home 
particularly benefited from video clips that provided simultaneous 
visual-auditory scaffolds. 


Table 3. Means and standard deviations of vocabulary assessments 
(proportion of questions correct) and fixation durations by dual coding 
group. 

                       Dual coding         Non-dual coding 
                       M        SD         M        SD 
Vocabulary*            .67      .31        .52      .31 
Fixation duration      64.39    21.69      67.82    19.67 

Note. *p < .05. 


Table 4. Main effects and interactions for vocabulary learning by 
dual-coding condition. 

Dual-coding vs. non-dual-coding condition: main effects and interactions 
Dependent variable: percent posttests vocabulary 

Contrast                F      df     Sig.    MSeffect  SSerror  MSerror 
Dual-coding condition   6.07   1/38   .018*   .235      1.47     .039 
Dual-coding × Age       .531   1/38   .471    .021 
Dual-coding × LEQ       6.69   1/38   .014*   .259 
Dual-coding × PPVT      .114   1/38   .737    .004 
Age                     3.27   1/38   .079    .259      3.01     .079 
LEQ                     .004   1/38   .951    .000 
PPVT                    5.85   1/38   .020*   .463 

Note. *p < .05. 
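
As a rough consistency check on Table 4, each F ratio should approximately equal the effect mean square divided by the error mean square, where the error mean square is the error sum of squares over its degrees of freedom. The sketch below assumes that standard relationship and uses the table values as read here (.235, .259, and .463 for the effects; 1.47 and 3.01 for the error sums of squares; 38 error degrees of freedom). The function name is hypothetical.

```python
def f_ratio(ms_effect, ss_error, df_error):
    """F = MS_effect / MS_error, with MS_error = SS_error / df_error."""
    ms_error = ss_error / df_error
    return ms_effect / ms_error


# Within-subjects terms share the within-subjects error term (SS = 1.47).
f_dual_coding = f_ratio(0.235, 1.47, 38)      # reported F = 6.07
f_dual_by_leq = f_ratio(0.259, 1.47, 38)      # reported F = 6.69

# Between-subjects terms share the between-subjects error term (SS = 3.01).
f_ppvt = f_ratio(0.463, 3.01, 38)             # reported F = 5.85
```

Small discrepancies from the reported values are expected because the table entries are rounded.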

Examining children’s L2 language proficiency, we found that there was 
no interaction between dual-coding condition and child PPVT scores, 
F (1, 38) = .11, p = .74. In other words, children’s L2 proficiency did not 
appear to predict the importance of visual and auditory input being con- 
gruent versus incongruent. Instead, the interaction was specific to paren- 
tal English language proficiency rather than child baseline 
English vocabulary. 

Lastly, we examined the influence of children’s attention to screens 
using data from the Tobii eye-tracker. We ran a regression analysis in 
both the dual-coding and non-dual-coding conditions, using vocabulary 
outcomes as the dependent variable, and attention (fixation duration 
percentage), child’s L2 ability (PPVT), parents’ L2 ability (LEQ), and age as 
predictors (see Table 5). Regression findings indicated that visual atten- 
tion significantly predicted posttest scores in the dual-coding, congruent 
condition (β = .35, p = .017). When visual and auditory cues were 
simultaneously presented on screen, children’s attention to screen was 
associated with vocabulary gains. 

In the non-dual-coding condition, results demonstrated that visual 
attention did not predict posttest scores (β = .22, p = .19). This suggests 
that when visuals and auditory input did not match up, paying visual 
attention to the screen did not appear to predict vocabulary learning. 
Importantly, there were no differences in visual attention to screen 
between the congruent and incongruent videos, F (1, 35) = .48, p = .50, 
demonstrating that children attended to both types of visual representa- 
tions equivalently overall. 
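
To make the distinction between the unstandardized (B) and standardized (β) weights in the regression concrete, the following sketch fits an ordinary least-squares model and rescales each slope by the ratio of predictor to outcome standard deviations. It is an illustration under stated assumptions: the study does not name its statistical software, the data here are toy values, and the function names are hypothetical.

```python
from statistics import stdev


def ols_coefficients(X, y):
    """Least-squares fit of y on the columns of X plus an intercept.

    X is a list of rows (one per child); y is a list of outcomes.
    Returns [intercept, b1, ..., bk] (unstandardized B weights).
    """
    rows = [[1.0] + [float(v) for v in row] for row in X]
    k = len(rows[0])
    # Build the normal equations (X'X) b = X'y.
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    c = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    # Solve by Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        A[col], A[pivot] = A[pivot], A[col]
        c[col], c[pivot] = c[pivot], c[col]
        for r in range(col + 1, k):
            factor = A[r][col] / A[col][col]
            for j in range(col, k):
                A[r][j] -= factor * A[col][j]
            c[r] -= factor * c[col]
    # Back-substitution.
    b = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(A[i][j] * b[j] for j in range(i + 1, k))
        b[i] = (c[i] - tail) / A[i][i]
    return b


def standardized_betas(X, y):
    """Convert each slope B to beta = B * sd(x) / sd(y)."""
    b = ols_coefficients(X, y)
    sd_y = stdev(y)
    columns = list(zip(*X))
    return [bj * stdev(col) / sd_y for bj, col in zip(b[1:], columns)]


# Toy data (not from the study), generated from y = 1 + 2*x1 + 3*x2.
X_toy = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y_toy = [1, 3, 4, 6, 8]
b_toy = ols_coefficients(X_toy, y_toy)
beta_toy = standardized_betas(X_toy, y_toy)
```

Standardized weights allow predictors measured on different scales, such as fixation percentage and age in months, to be compared directly within the same model.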


Discussion 


The present study sought to examine whether dual-coding theory, a 
theory often used in CALL research, is a mechanism underpinning 
vocabulary learning among DLLs. More specifically, we aimed to (1) 
understand the extent to which dual-coding facilitated L2 vocabulary 
learning among DLLs; and to (2) understand the influence of child and 
parental language factors, as well as attention to screen, on L2 vocabulary 
learning in educational media. Using a within-subjects design in which 
children viewed vocabulary clips with both visual-auditory congruent 
and incongruent input, we found evidence that dual-coding is at play 
for DLL populations. 

Building on prior research with monolingual populations (Bus, Takacs, & 
Kegel, 2015; Verhallen et al., 2006), findings from this study demonstrate 
that DLLs are more likely to learn vocabulary words in a second language 
when they were presented with simultaneous auditory and visual input com- 
pared to words presented with auditory and visual input at different times. 
Video clips with dually coded words like ‘hurricane’ and ‘grater’ (Table 1) 
enhanced children’s vocabulary knowledge, serving as compensatory scaf- 
folds. Temporally congruent visuals likely provided DLLs with the supports 
needed to make core content comprehensible (Silverman & Hines, 2009; 
Takanishi & Le Menestrel, 2017). Having access to the meaning of new 
words through visual scaffolds helps reinforce vocabulary concepts, deepen 
vocabulary knowledge, and support oral language development in young 
DLLs (Gersten & Baker, 2000; Takanishi & Le Menestrel, 2017). Scaffolds 
are critical in dual-language development because they reflect students’ 
zones of proximal development and guide learners towards deeper, more 
robust understandings of new words (Vygotsky, 1980). Educational media 
provides opportunities to learners by meeting them in their appropriate 
zones of proximal development and providing instruction that will deepen 
vocabulary knowledge. 


Table 5. Regression statistics predicting vocabulary in the dual-coding 
and non-dual-coding conditions. 

Outcome               Predictor           t      p     B      β     F     df    p     adj R² 
Dual-coding           Age                 1.87   .070  .009   .26   5.05  4,39  .003  .29 
vocabulary posttest   LEQ                 -1.84  .074  -.043  -.26 
                      PPVT                1.94   .060  .005   .27 
                      Fixation duration*  2.51   .017  .004   .35 
Non-dual-coding       Age                 .93    .36   .006   .15   1.68  4,39  .18   .065 
vocabulary posttest   LEQ                 .86    .40   .024   .14 
                      PPVT                1.32   .20   .004   .21 
                      Fixation duration   1.35   .19   .003   .22 

Note. *p < .05. 

At the same time, not all children’s zones of proximal development lie 
within the same range. Some DLLs, for example, might have stronger L2 
vocabularies and/or greater exposure to the L2 at home than others, 
which would facilitate learning from screens. Our second research ques- 
tion investigated the influence of L2 language proficiency and environ- 
ment as contextual factors that may moderate the impact of visual- 
auditory congruence in technology-based learning environments. Using 
children’s L2 proficiency and parents’ L2 proficiency or L2 exposure in 
the home environment, findings suggest that the dual-coding mechanism 
is particularly beneficial for those with low L2 exposure in the home as 
they benefited more from visual-auditory congruent learning experiences 
than those with high L2 home exposure. By investigating DLLs who may 
have limited experiences with the English language and who are often 
underrepresented in media scholarship, this study demonstrates that edu- 
cational media has the potential to build children’s L2 vocabulary know- 
ledge through wide exposure to the second language. Interestingly, this 
was specifically tied to children’s home language environment, not their 
personal L2 vocabularies, highlighting the unique role of L2 use in 
the home. 

Consequently, media appears to be an opportune platform for 
learning a second language as it can deliver ample, repeated exposures to 
new vocabulary words to young DLLs. The amount of language exposure 
and language used by individuals on a daily basis is likely to affect bilin- 
gual vocabulary development at all ages. In particular, young DLLs with 
daily input and output in two languages are likely to gain proficiency in 
bilingual language performance (Bedore, Pena, Griffin, & Hixon, 2016). 
In fact, research on dual-coding theory with bilingual populations sug- 
gests that the verbal processing system may be further divided into a first 
language and second language system. When bilingual speakers hear 
words in both their L1 and L2 and also receive a nonverbal stimulus, they 
are more likely to remember the information than if only one language is used 
with the nonverbal stimulus (Jared, Pei Yun Poh, & Paivio, 2013; Paivio 
& Lambert, 1981). Future studies may consider examining how verbal 
coding in two languages with nonverbal images might facilitate word 
learning in media environments. 

Examining more precisely how dual-coding serves as a mechanism 
underlying learning from media, we used state-of-the-art eye-tracking 
technology to document children’s visual attention to screens. Because 
visual attention only predicted posttest scores in the dual-coding condi- 
tion, findings from this study suggest that preschoolers’ visual attention 
to screens facilitated learning only when images and sounds were 
expressed simultaneously. In other words, without dual-coding mecha- 
nisms at play, visual attention to screens did not facilitate learning of 
new vocabulary words. Besides providing viewers with information to 
process in two systems, another interpretation of these findings is that 
dually-coded presentations may attract more attention to screens, which 
is a predictor of vocabulary learning (Boers et al., 2017). As Boers et al. 
(2017) suggest, when multimodal presentations appear on screen, learners 
are more likely to inspect a complex gloss than a simple gloss, 
which is associated with better word retention (Al Seghayer, 2001). 
Extending prior research on how production techniques used in educa- 
tional media affect viewing behavior (Huston & Wright, 1983; Kirkorian, 
Wartella, & Anderson, 2008), results from this study also suggest that 
the use of formal features like zooms and other attention-directing cues 
may be beneficial for DLL populations when they are strategically paired 
with nonvisual input. 

The present study contributes to our understanding of how dual-cod- 
ing theory can be applied to under-researched DLL populations. 
However, the study has limitations that should be acknowledged. First, 
the sample is small, comprising 43 DLLs. While 
small samples might compromise the generalizability of the findings, 
adopting a within-subjects design limits between-subjects variability, 
which reduces error and increases power to detect potential differences 
between conditions. Second, not all language groups are represented in 
the sample. The majority of students were Spanish speakers, which limits 
the generalizability of results to all DLLs as children who speak another 
language may require different supports. Third, it is difficult to create 
multimedia video clips that are perfectly comparable to one another, 
with equally difficult vocabulary words. We note that only six video clips 
and vocabulary words were used in the study, but recognize this was 
necessary to accommodate the attention spans of four-year-old children 
who were screened, pretested, eye-tracked, given videos to view, and 
then post-tested. We attempted to manipulate clips so that they were 
relatively equal in video length, word repetitions, and difficulty. Again, 
using a within-subjects design and assigning children to condition in a 
counterbalanced fashion allowed us to increase internal validity and 
wash out some of the differences between video clips. Moreover, video 
clips were chosen from currently available programs in the media 
marketplace, enhancing the ecological validity of the video clips. Because 
video clips were chosen from programs that are available to children, we 
were not able to select words that were perfectly comparable to one 
another. We used the CHILDES database to select words that appeared 
less frequently in the speech of four-year-old children. However, word 
frequency may not always indicate if a word is rare or indicate whether 
a word has polysemous meanings. Future studies would benefit from 
using multiple indicators to determine word difficulty. In the same vein, 
we used only one educational media program, Sesame Street, in the 
study, which may limit the generalizability of these 
findings to all educational media. Still, Sesame Street has been studied for 
decades by scholars around the globe (Fisch & Truglio, 2014) and using 
the same program allowed us to control for a program effect. 

In sum, this CALL-situated study provides evidence to suggest that 
dual-coding is a key mechanism underlying L2 vocabulary learning for 
DLLs on screen; that this mechanism is particularly beneficial for those 
with low L2 exposure in the home as they benefited more from media 
that used visuals and sounds together, rather than those with high L2 
exposure; and that preschoolers’ visual attention to screens facilitated 
vocabulary learning only when images and sounds were expressed simul- 
taneously. Findings suggest that congruent visual and auditory sources of 
input may serve as important compensatory scaffolds that facilitate the 
teaching and learning of L2 vocabulary on screens. Moreover, these 
screen-based scaffolds might support teachers in the classroom as addi- 
tive tools that leverage technology to promote early bilingual develop- 
ment. Technology is ubiquitous in the lives of young children, with 
educational media in the palms of young hands all over the world. With 
the potential to access far-reaching households and provide DLLs with 
broad exposure to vocabulary words in a second language, the current 
study establishes that educational media has the ability to equip young 
DLLs with early lexical skills that promote a new or additional language, 
preparing them for the increasingly multilingual world. 